Sinusoidal audio coding

ABSTRACT

Coding of an audio signal (x) represented by a respective set of sampled signal values for each of a plurality of sequential segments is disclosed. The sampled signal values are used to determine sinusoidal components (CS) for each of the plurality of sequential segments. The sinusoidal components (CS) are subtracted from the sampled signal values to provide a set of values (s1, s2) representing a first residual component (x3) of the audio signal. The first residual component (x3) is conditioned (18) to remove selected tonal components and to provide a set of values (s1′, s2′) representing a second residual component (x3′) of the audio signal. The second residual component is modelled (14) by determining noise parameters (CN) approximating the second residual component (x3′); and an encoded audio stream (AS) is generated including the noise parameters (CN) and the codes representing the sinusoidal components (CS).

FIELD OF THE INVENTION

The present invention relates to coding audio signals.

BACKGROUND OF THE INVENTION

Referring now to FIG. 1, a parametric coding scheme, in particular a sinusoidal coder, is described in PCT Patent Application No. WO 01/69593. In this coder, an input audio signal x(t) is split into several (overlapping) segments, typically of length 20 ms. Each segment is decomposed into transient, sinusoidal and noise components. This decomposition is done sequentially, i.e. the transients are first extracted from the input signal x(t) in a transient coder 11 to leave a 1st residual signal x1/x2 depending on whether gain control is applied or not; the 1st residual signal is coded using a sinusoidal coder 13; then the coded sinusoids are extracted from the 1st residual signal to leave a 2nd residual signal x3; this 2nd residual signal is in turn coded using a noise coder 14.

In the sinusoidal analyser 130, the 1st residual signal x2 for each segment is modelled using a number of sinusoids represented by amplitude, frequency and phase parameters. Once the sinusoids for a segment are estimated, a tracking algorithm is initiated. This algorithm links sinusoids with each other on a segment-to-segment basis to obtain so-called tracks. The tracking algorithm thus results in sinusoidal codes CS comprising sinusoidal tracks that start at a specific time instance, evolve for a certain amount of time over a plurality of time segments and then stop.

A number of coding methods can be employed in the noise coder to model the 2nd residual signal x3. For transparent audio quality, the noise coder can be a waveform coder in the form of a filter bank. Alternatively, for good quality and low bit-rate, the noise coder can employ a synthetic noise model to produce, for example, Autoregressive Moving Average (ARMA) or Linear Predictive Coding (LPC) filter parameters.

It is also possible to derive other components of the input audio signal such as harmonic complexes. The present specification relates only to sinusoidal and noise components, but the extension to harmonic complexes does not affect the invention in any way.

The extraction of sinusoids from a segment of an audio signal can be problematic. Within segments, sinusoidal amplitudes and frequencies can vary and this is referred to as instationarity. Furthermore, inaccuracies can occur in the estimation of the sinusoids. As a result, the spectral suppression achieved using the coded sinusoids is not always satisfactory or ideal. This results in the presence of sinusoidal-like components, especially at or near the positions of the coded sinusoids, in the 2nd residual signal.

In addition, at low bit rates, where there are only enough bits to code a few sinusoids, sinusoidal components will still be present in the 2nd residual.

Noise coders in general model the temporal and spectral envelope of the residual signal x3 rather coarsely, i.e. they have a limited spectral resolution and artefacts can appear when a noise coder models sinusoidal components. Even if tonal components remaining in the residual are masked, audible artefacts can occur, due to the limited spectral resolution of the noise model. This is especially likely to occur at low frequencies where the auditory system has a good spectral resolution and the spectral resolution of the noise coder is usually worse. Also, in contrast to a stationary, tonal signal, the energy of the noisy component will always fluctuate over time. These fluctuations may make a previously masked tonal component audible. Energy fluctuations will be biggest in regions where spectral resolution should be good, i.e. at low frequencies. Thus, apart from the fact that in trying to model the sinusoidal-like components in the residual signal x3, the noise coder requires additional bits for the noise codes CN, modelling these components as noise may result in audible artefacts, particularly at low frequencies.

The present invention attempts to mitigate this problem.

DISCLOSURE OF THE INVENTION

According to the present invention there is provided a method according to claim 1.

The invention includes a re-analysis stage prior to the noise coder. In one embodiment, tonal components are removed from the residual by, for example, matching pursuit in combination with an energy-based stopping criterion which determines when to stop extracting tonal components.

In another embodiment, the residual signal is additionally suppressed at the frequencies of the coded sinusoids and their surroundings. The number of surrounding frequencies can be fixed or dependent on the frequency. A psycho-acoustical frequency division (e.g. Bark/ERB bands) can also be used. The amount of suppression can, for example, depend on the number of sinusoids or the energy of the sinusoids. As a result, the noise coder does not need to model these sinusoidal regions any more.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a prior art audio recorder including an audio encoder;

FIG. 2 shows an embodiment of an audio coder according to the invention;

FIG. 3 shows an embodiment of an audio player including an audio decoder operable with the coder of the invention;

FIG. 4 illustrates the processing performed by the re-analyser of the embodiments of the invention; and

FIG. 5 shows a system comprising an audio coder according to the invention and an audio player.

DESCRIPTION OF THE PREFERRED EMBODIMENT

Preferred embodiments of the invention will now be described with reference to the accompanying drawings wherein like components have been accorded like reference numerals and, unless otherwise stated, perform a like function. In a preferred embodiment of the present invention, FIG. 2, the encoder 1′ is a sinusoidal coder of the type described in PCT Patent Application No. WO 01/69593. The operation of this prior art coder and its corresponding decoder has been well described and description is only provided here where relevant to the present invention.

In both the prior art and the preferred embodiment, the audio coder 1′ samples an input audio signal at a certain sampling frequency resulting in a digital representation x(t) of the audio signal. The coder 1′ then separates the sampled input signal into three components: transient signal components, sustained deterministic components, and sustained stochastic components. The audio coder 1′ comprises a transient coder 11, a sinusoidal coder 13 and a noise coder 14.

The transient coder 11 comprises a transient detector (TD) 110, a transient analyzer (TA) 111 and a transient synthesizer (TS) 112. First, the signal x(t) enters the transient detector 110. This detector 110 estimates if there is a transient signal component and its position. This information is fed to the transient analyzer 111. If the position of a transient signal component is determined, the transient analyzer 111 tries to extract (the main part of) the transient signal component. It matches a shape function to a signal segment preferably starting at an estimated start position, and determines content underneath the shape function, by employing for example a (small) number of sinusoidal components. This information is contained in the transient code CT and more detailed information on generating the transient code CT is provided in PCT Patent Application No. WO 01/69593.

The transient code CT is furnished to the transient synthesizer 112. The synthesized transient signal component is subtracted from the input signal x(t) in subtractor 16, resulting in a signal x2.

The signal x2 is furnished to the sinusoidal coder 13 where it is analyzed in a sinusoidal analyzer (SA) 130, which determines the (deterministic) sinusoidal components. It will therefore be seen that while the presence of the transient analyser is desirable, it is not necessary and the invention can be implemented without such an analyser. Alternatively, as mentioned above, the invention can be implemented with, for example, a harmonic complex analyser. In any case, the end result of sinusoidal coding is a sinusoidal code CS and a more detailed example illustrating the conventional generation of an exemplary sinusoidal code CS is provided in PCT Patent Application No. WO 00/79519.

In brief, however, such a sinusoidal coder encodes the input signal x2 as tracks of sinusoidal components linked from one frame segment to the next. From the sinusoidal code CS generated with the sinusoidal coder, the sinusoidal signal component is reconstructed by a sinusoidal synthesizer (SS) 131. This signal is subtracted in subtractor 17 from the input x2 to the sinusoidal coder 13, resulting in a remaining signal x3.

According to the present invention, there is provided a re-analyser 18, which conditions the residual signal x3 prior to encoding by a noise coder 14. In each of the embodiments of the invention, the re-analyser 18 selectively removes or suppresses spectral regions at or near the positions of tonal components from the residual signal x3 and provides a conditioned residual signal x3′ to the noise coder 14.

Referring now to FIG. 4, as mentioned above, in the embodiments, the residual signal x3 provided to the re-analyser 18 comprises segments s1, s2 . . . overlapping in successive time frames t(n−1), t(n), t(n+1). Typically sinusoids are updated at a rate of 10 ms and each segment s1, s2 . . . is twice the length of the update rate, i.e. 20 ms. In each of the embodiments, the re-analyser 18 provides the overlapping time windows t(n−1), t(n), t(n+1) to be re-analysed by using a Hanning window function to combine the signals from overlapping segments s1, s2 . . . into a single signal representing a time window, step 42. An FFT (Fast Fourier Transform) is applied on the windowed signal, resulting in a complex frequency spectrum representation of the time window signal, step 44. For a sampling rate of 44.1 kHz and a frame length of 20 ms, the length of the FFT is typically 2048.
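By way of illustration only, steps 42 and 44 may be sketched as follows. The sketch assumes a periodic Hanning window and zero-padding up to the transform length, and uses a direct DFT on a toy signal in place of an FFT; the function names, the 8 kHz test tone and the 128-point transform are assumptions made for the example and are not taken from the specification.

```python
import cmath
import math

def hann(n):
    """Periodic Hanning window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def windowed_spectrum(samples, fft_len):
    """Window the samples, zero-pad to fft_len, return bins 0..fft_len/2.

    A direct O(N^2) DFT is used for clarity; a practical coder would use an FFT.
    """
    w = hann(len(samples))
    padded = [s * wi for s, wi in zip(samples, w)]
    padded += [0.0] * (fft_len - len(samples))
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / fft_len)
            for n, x in enumerate(padded))
        for k in range(fft_len // 2 + 1)
    ]

# Segment length for the figures quoted in the text: 20 ms at 44.1 kHz.
segment_len = round(0.020 * 44100)

# Toy example (assumed values): 1 kHz tone at 8 kHz, 64 samples, 128-point DFT.
fs, n, fft_len = 8000, 64, 128
tone = [math.sin(2.0 * math.pi * 1000.0 * i / fs) for i in range(n)]
spectrum = windowed_spectrum(tone, fft_len)
peak_bin = max(range(len(spectrum)), key=lambda k: abs(spectrum[k]))
peak_hz = peak_bin * fs / fft_len
```

With the figures quoted above, a 20 ms segment at 44.1 kHz holds 882 samples, so a transform length of 2048 implies zero-padding under the assumptions of this sketch.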

In a first embodiment, in the re-analyser 18, conditioning of the spectrum generated by the FFT, step 46, comprises applying a conventional type matching pursuit algorithm to iteratively remove peaks from the spectrum. In the first embodiment, the algorithm iteratively removes those peaks that result in the greatest reduction of energy. In general this will mean that the matching pursuit algorithm first extracts peaks corresponding to tonal components and then tends to extract noisy peaks, because the reduction in energy is, on average, bigger for the extraction of tonal peaks than for the extraction of noisy ones. Thus, the extraction should stop just after the extraction of all tonal components and just before the extraction of noisy ones. On the one hand, if not all tonal components are removed, when synthesised in a decoder, the signal may be too noisy, because tonal components will have been modelled by the noise coder 14. On the other hand, if too many and thus some noisy components are removed, the synthesised signal may sound metallic, because of resulting gaps in unsuitable regions of the spectrum of the residual signal x3′ provided to the noise coder 14.

In one implementation of the first embodiment, a stopping criterion indicates when to stop extracting components. This criterion is based on the energy of the residual before and after the extraction of a peak. Thus, when the reduction in energy after removal of a peak is less than a certain percentage, this indicates that all tonal peaks have been extracted and that the conditioned residual x3′ will be free of tonal components.
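The energy-based stopping criterion may be illustrated with the following much simplified sketch. It is not a full matching pursuit: each "peak" is assumed to be a single spectral bin of a per-bin energy list, whereas a conventional matching pursuit would subtract a windowed sinusoid from the signal. The test is also made before a bin is removed, so the first noisy candidate is left in place. The function name and the toy spectrum are hypothetical.

```python
def remove_tonal_peaks(power, min_reduction):
    """Zero the strongest bins until a removal would save too little energy.

    power: per-bin energies of the residual spectrum (non-negative floats).
    min_reduction: stop when removing the strongest remaining bin would cut
        the remaining energy by less than this fraction (0.05 means 5%).
    Returns (conditioned_power, removed_bins).
    """
    remaining = list(power)
    removed = []
    while True:
        total = sum(remaining)
        if total == 0.0:
            break
        k = max(range(len(remaining)), key=lambda i: remaining[i])
        if remaining[k] / total < min_reduction:
            break  # candidate looks noisy rather than tonal: stop here
        remaining[k] = 0.0
        removed.append(k)
    return remaining, removed

# Toy spectrum (assumed): flat noise floor in 100 bins plus two tonal bins.
power = [1.0] * 100
power[10] = 60.0
power[40] = 25.0
conditioned, removed = remove_tonal_peaks(power, 0.05)
```

In this toy case the two strong bins account for about 33% and 20% of the remaining energy when they are removed, while the strongest noise bin accounts for about 1%, so extraction stops after the second removal.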

Since the reduction in energy depends on the length of the analysis window, the energy criterion is inversely proportional to the window length. For example, for a window length of 1024 sample points at 48 kHz (approximately 21 ms), a useful value for the criterion is a reduction in energy of 5%, whereas for a window length of 512 sample points at 48 kHz (approximately 10.7 ms), it is 10%.
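The inverse proportionality can be written as a one-line rule. The sketch below simply scales the 5% value quoted for 1024 samples; the function names and the use of 1024 samples as the reference point are assumptions for illustration.

```python
def stop_threshold(window_len, ref_len=1024, ref_threshold=0.05):
    """Energy-reduction threshold, inversely proportional to window length."""
    return ref_threshold * ref_len / window_len

def window_ms(window_len, fs=48000):
    """Window duration in milliseconds at sampling rate fs."""
    return 1000.0 * window_len / fs

threshold_1024 = stop_threshold(1024)
threshold_512 = stop_threshold(512)
```

Under this rule a 1024-sample window gives a threshold of 5% and a 512-sample window gives 10%, matching the two values given above.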

In another implementation of the first embodiment, a fixed number of peaks are extracted, i.e. matching pursuit runs through a fixed number of iterations.

As an alternative to the iterative matching pursuit approach of the first embodiment, in a second embodiment, the conditioning step 46 picks and removes a number (fixed or variable, for example all peaks in the spectrum) of the highest energy peaks from the spectrum generated in step 44 in a single step. This technique has the advantage that it is faster (being performed in a single iteration) than matching pursuit; however, it may lose the benefit of picking up peaks masked by more powerful peaks that may be detected by matching pursuit.
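A single-pass version of the second embodiment might look like the sketch below. It assumes that a peak is a local maximum of a per-bin energy list and that removal means zeroing that bin; both choices, and the function name, are illustrative assumptions rather than details given in the specification.

```python
def remove_top_peaks(power, count):
    """Zero the `count` highest-energy local maxima in a single pass."""
    peaks = [
        i for i in range(1, len(power) - 1)
        if power[i] > power[i - 1] and power[i] >= power[i + 1]
    ]
    strongest = sorted(peaks, key=lambda i: power[i], reverse=True)[:count]
    conditioned = list(power)
    for i in strongest:
        conditioned[i] = 0.0
    return conditioned, sorted(strongest)

# Toy spectrum (assumed) with local maxima at bins 2, 5 and 7.
power = [1.0, 2.0, 9.0, 2.0, 1.0, 5.0, 1.0, 3.0, 1.0]
conditioned, removed = remove_top_peaks(power, 2)
```

Because all peaks are picked from the same unmodified spectrum, a weaker component hidden on the skirt of a stronger one is never seen as a local maximum, which is the drawback relative to matching pursuit noted above.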

In the cases above where a fixed number of peaks are removed either iteratively or in a single step, it has been found experimentally that the extraction of 5 peaks or fewer resulted in better, less noisy signals while the extraction of more than 5 peaks resulted in a less noisy but metallic sounding signal.

In all of the above implementations, the re-analyser 18 takes an inverse FFT of the residual spectrum when the conditioning step 46 has completed to obtain a time domain signal, step 48. By applying overlap-add for successive conditioned time domain signals, step 50, the conditioned residual x3′ is created and this is fed through the noise coder 14. It will be seen that the conditioned segments s1′, s2′ . . . of the residual x3′ correspond to the segments s1, s2 . . . in the time domain and as such no loss of synchronisation occurs as a result of the re-analysis.
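The overlap-add of step 50 can be sketched as follows, assuming periodic Hanning windows with 50% overlap, for which overlapping windows sum to one. To isolate the overlap-add step the conditioning is left out (the spectrum is assumed unchanged), so the fully overlapped part of the signal should be recovered exactly. The names and the toy ramp signal are hypothetical.

```python
import math

def hann(n):
    """Periodic Hanning window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def overlap_add(segments, hop):
    """Sum equally long segments placed `hop` samples apart."""
    seg_len = len(segments[0])
    out = [0.0] * (hop * (len(segments) - 1) + seg_len)
    for index, segment in enumerate(segments):
        start = index * hop
        for offset, value in enumerate(segment):
            out[start + offset] += value
    return out

# A ramp cut into 50%-overlapping windowed segments (length 8, hop 4).
seg_len, hop = 8, 4
signal = [float(i) for i in range(20)]
window = hann(seg_len)
segments = [
    [signal[start + i] * window[i] for i in range(seg_len)]
    for start in range(0, len(signal) - seg_len + 1, hop)
]
rebuilt = overlap_add(segments, hop)
```

Samples 4 to 15 are covered by two windows and are reconstructed; the first and last half-segments are covered by only one window and remain tapered.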

It will be seen that where the residual signal x3 is not an overlapping signal but rather is a continuous time signal, then the windowing step 42 will not be required. Similarly, if the noise coder 14 expects a continuous time signal rather than an overlapping signal, the overlap-add step 50 will not be required. Nonetheless, it will also be seen that the first embodiment can be implemented without requiring any changes to be made to the conventional sinusoidal coder 13 or the noise coder 14. Also, in both of the above implementations psycho-acoustic considerations do not have to be taken into account when conditioning the signal x3 to produce the signal x3′.

In third and fourth embodiments of the invention, while no changes need to be made to the internal operation of the sinusoidal coder 13, the re-analyser 18 is provided with the sinusoidal codes CS for each segment s1, s2 . . . as indicated by the dashed line 52 of FIGS. 2 and 4. Again, sinusoidal codes for successive segments need to be combined to provide a single set of values for each time window t(n−1), t(n), t(n+1). In the third embodiment, for each of the sinusoids that are estimated for a given time window, as indicated by the frequency parameter for each sinusoidal component, the conditioning step 46 determines the corresponding frequency bin in the spectrum derived at step 44. The frequency bin is then multiplied by a factor (e.g. 0.001), i.e. severely attenuated. Adjacent frequency bins are also suppressed (e.g. by a factor of 0.01) and this results in a conditioned complex spectrum. As before, an inverse FFT is applied to this conditioned spectrum, step 48, and processing continues as before.
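The bin suppression of the third embodiment may be sketched as below. The sketch assumes a one-sided spectrum holding bins 0 to N/2 of an N-point transform, maps each coded frequency to its nearest bin, and suppresses exactly one neighbour on each side; the function name and these choices are assumptions, while the factors 0.001 and 0.01 are the example values given above.

```python
def suppress_coded_sinusoids(spectrum, freqs_hz, fs,
                             centre_gain=0.001, side_gain=0.01):
    """Attenuate the bins at and next to each coded sinusoid frequency.

    spectrum: one-sided complex spectrum (bins 0..fft_len/2), an assumption.
    freqs_hz: frequencies of the coded sinusoids for this time window.
    """
    fft_len = 2 * (len(spectrum) - 1)
    gains = [1.0] * len(spectrum)
    for f in freqs_hz:
        k = round(f * fft_len / fs)  # nearest bin to the coded frequency
        for index, gain in ((k - 1, side_gain), (k, centre_gain),
                            (k + 1, side_gain)):
            if 0 <= index < len(spectrum):
                # If regions overlap, keep the stronger attenuation.
                gains[index] = min(gains[index], gain)
    return [value * gain for value, gain in zip(spectrum, gains)]

# Flat toy spectrum for a 2048-point transform at 44.1 kHz, one 1 kHz sinusoid.
spectrum = [complex(1.0, 0.0)] * 1025
conditioned = suppress_coded_sinusoids(spectrum, [1000.0], 44100)
```

For a 2048-point transform at 44.1 kHz the bin spacing is about 21.5 Hz, so a 1 kHz sinusoid falls in bin 46.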

In the fourth embodiment of the invention, the re-analyser 18 is provided with the original signal for each segment s1, s2 . . . as indicated by the dashed line 56 of FIGS. 2 and 4. In the conditioning step 46, the frequency bins of the complex spectrum derived at step 44 are combined in non-equidistant frequency bands according to a psycho-acoustical model (e.g. Bark, ERB). Per psycho-acoustic based frequency band, the energy of the sinusoids derived from the sinusoidal codes CS in that band (line 52) and the energy of the original input signal in that band (line 56) are compared. Instead of the actual energies of sinusoids and original in a band, estimates may also be used. A possible estimate of the original energy is the energy of the sinusoidal components plus the energy of the residual. This estimate is only equal to the actual energy of the original if the sinusoidal components and the residual are uncorrelated. A possible estimate of the sinusoidal energy is the energy of the original minus the energy of the residual. Again, this estimate is only equal to the actual energy of the sinusoidal components if the sinusoidal components and the residual are uncorrelated in that band. If the difference is small (e.g. 2 dB), the frequency bins in the frequency band for the spectrum derived at step 44 are set to zero based on the assumption that in this particular frequency region the original signal is described well enough by the sinusoids. A band is also set to zero if the energy of the sinusoidal components is higher than the energy of the original. This may, for example, happen when different windows are used. As before, an inverse FFT can be applied to this conditioned spectrum, step 48, and processing can continue as before with the conditioned time domain signal x3′ being fed to noise coder 14.
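The band-wise comparison of the fourth embodiment may be sketched as follows. The per-band energies are taken as given inputs, "small" is read as a difference of at most 2 dB, and bands without any coded sinusoid are left untouched; these readings, the function names and the band edges used in the example are assumptions made for illustration.

```python
import math

def bands_to_zero(sin_energy, orig_energy, max_diff_db=2.0):
    """Return the indices of bands that the sinusoids describe well enough.

    A band is selected when the sinusoidal energy is at least the original
    energy, or when the original exceeds it by no more than max_diff_db.
    """
    zeroed = []
    for band, (e_sin, e_orig) in enumerate(zip(sin_energy, orig_energy)):
        if e_sin <= 0.0:
            continue  # no coded sinusoid here: leave it to the noise coder
        if e_sin >= e_orig:
            zeroed.append(band)
            continue
        if 10.0 * math.log10(e_orig / e_sin) <= max_diff_db:
            zeroed.append(band)
    return zeroed

def blank_bands(spectrum, band_edges, zeroed):
    """Set every bin of the selected bands to zero.

    band_edges[b] is the first bin of band b; band_edges[-1] is the end.
    """
    out = list(spectrum)
    for band in zeroed:
        for k in range(band_edges[band], band_edges[band + 1]):
            out[k] = 0.0
    return out

# Toy values (assumed): four non-equidistant bands over ten bins.
sin_energy = [9.0, 1.0, 0.0, 5.0]
orig_energy = [10.0, 10.0, 4.0, 4.5]
band_edges = [0, 1, 3, 6, 10]
zeroed = bands_to_zero(sin_energy, orig_energy)
conditioned = blank_bands([1.0] * 10, band_edges, zeroed)
```

Here band 0 differs by about 0.46 dB and band 3 has more sinusoidal than original energy, so both are blanked, while band 1 (10 dB difference) and band 2 (no sinusoids) are kept.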

However, by setting frequency bands to zero, noise parameters can be encoded very efficiently, resulting in a considerable coding gain. Thus, if the conditioned frequency spectra generated at step 46 were fed directly to an adapted noise coder, the noise coder may be able to apply, for example, run-length coding to take advantage of the gain of a number of consecutive frequency bands being zero. In existing state-of-the-art noise coders run-length coding is not applied, because without conditioning it only rarely occurs that parts of the residual spectrum are zero. However, by applying spectral blanking, run-length encoding will result in a considerable bit-rate reduction. Corresponding changes would of course need to be made to the decoder to take account of any changes in the coding of noise information.
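As an illustration of why blanked bands code cheaply, the sketch below run-length encodes a list of per-band values, collapsing each run of zeros into one token. The token format is hypothetical and is not the bit-stream syntax of any actual noise coder.

```python
def run_length_encode(band_values):
    """Encode a list as ('zeros', run_length) and ('value', v) tokens."""
    tokens = []
    run = 0
    for v in band_values:
        if v == 0:
            run += 1
            continue
        if run:
            tokens.append(("zeros", run))
            run = 0
        tokens.append(("value", v))
    if run:
        tokens.append(("zeros", run))
    return tokens

def run_length_decode(tokens):
    """Invert run_length_encode."""
    values = []
    for kind, payload in tokens:
        if kind == "zeros":
            values.extend([0] * payload)
        else:
            values.append(payload)
    return values

bands = [3, 0, 0, 0, 7, 0, 0]
tokens = run_length_encode(bands)
```

The decoder side needs the matching `run_length_decode`, which reflects the remark above that the decoder must be changed together with the coder.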

In a fifth embodiment of the invention, rather than providing the sinusoidal codes CS to the re-analyser 18, the sinusoidal coder 13 is adapted to provide to the re-analyser 18 the parameters for sinusoidal components which were detected by the sinusoidal analyser 130 but dropped during the coding process as indicated by the line 54 in FIGS. 2 and 4. As well as frequency and amplitude values, these parameters also include an indication of the reason for dropping the sinusoids. Although not an exclusive list of types, these can include:

- The sinusoid was too short to be useful for tracking (S);
- The sinusoid was masked by a more powerful sinusoid (M);
- The sinusoid was dropped to reduce the bit rate (B).

In the case of types M and B, it will be seen that these components are more likely to be tonal than in the case of type S. Therefore in the fifth embodiment, the conditioning step 46 comprises removing a number (fixed or variable) of the highest energy peaks corresponding to M and B type frequencies before providing the conditioned spectrum for processing as before in steps 48, 50.
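The selection made in the fifth embodiment can be sketched as a filter over the dropped sinusoids. The record layout, the use of squared amplitude as the energy measure and the function name are assumptions for illustration; the reason codes S, M and B are those listed above.

```python
from typing import NamedTuple

class DroppedSinusoid(NamedTuple):
    """A sinusoid detected by the analyser but not coded (hypothetical layout)."""
    freq_hz: float
    amplitude: float
    reason: str  # "S" too short, "M" masked, "B" bit-rate

def select_for_removal(dropped, max_count):
    """Pick the highest-energy dropped sinusoids of type M or B."""
    tonal = [d for d in dropped if d.reason in ("M", "B")]
    ranked = sorted(tonal, key=lambda d: d.amplitude ** 2, reverse=True)
    return ranked[:max_count]

dropped = [
    DroppedSinusoid(440.0, 0.5, "M"),
    DroppedSinusoid(880.0, 0.9, "S"),
    DroppedSinusoid(1320.0, 0.7, "B"),
    DroppedSinusoid(200.0, 0.1, "B"),
]
selected = select_for_removal(dropped, 2)
selected_freqs = [d.freq_hz for d in selected]
```

The strongest entry in the toy list is of type S and is therefore skipped, so the two frequencies passed on for suppression are those of the B and M type components.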

While each of the above embodiments has been described independently, it will be seen that one or more of these techniques may be combined in the conditioning step 46. For example, the steps of the fifth embodiment may be performed to remove a limited number of M or B type components before the steps of the first embodiment are performed to remove other peaks.

It will also be seen that while each of the embodiments has been described in terms of conditioning the residual signal x3 in the frequency domain, the re-analyser 18 could equally operate in the time domain.

In any case, the conditioned signal x3′ produced by the re-analyser 18 can now more properly be assumed to comprise only noise and the noise coder 14 of the preferred embodiment produces a noise code CN representative of this noise, as described in, for example, PCT patent application No. PCT/EP00/04599.

Finally, in a multiplexer 15, an audio stream AS is constituted which includes the codes CT, CS and CN. The audio stream AS is furnished to e.g. a data bus, an antenna system, a storage medium etc.

FIG. 3 shows an audio player 3 suitable for decoding an audio stream AS′, e.g. generated by an encoder 1′ of FIG. 2, obtained from a data bus, antenna system, storage medium etc. Unless stated, the audio player 3 is as described in PCT Patent Application No. WO 01/69593. In brief, in such a player, the audio stream AS′ is de-multiplexed in a de-multiplexer 30 to obtain the codes CT, CS and CN. These codes are furnished to a transient synthesizer 31, a sinusoidal synthesizer 32 and a noise synthesizer 33 respectively. From the transient code CT, the transient signal components are calculated in the transient synthesizer 31. In case the transient code indicates a shape function, the shape is calculated based on the received parameters. Further, the shape content is calculated based on the frequencies and amplitudes of the sinusoidal components. If the transient code CT indicates a step, then no transient is calculated. The total transient signal yT is a sum of all transients.

The sinusoidal code CS is used to generate signal yS, described as a sum of sinusoids on a given segment. At the same time as the sinusoidal components of the signal are being synthesized, the noise code CN is fed to a noise synthesizer NS 33, which is mainly a filter having a frequency response approximating the spectrum of the noise. The NS 33 generates reconstructed noise yN by filtering a white noise signal with the noise code CN.

In the player of FIG. 3, additional suppression of frequency regions near or at positions of sinusoids described by CS is applied by a re-analyser 39 corresponding to the first to fourth embodiments of the re-analyser 18 described above. The re-analyser therefore removes unwanted components that can be present in the noise signal yN to produce a conditioned noise signal yN′. These unwanted components are, for example, parts of tonal components that are modelled as noise in the encoder (1 or 1′). By using this method in the decoder, the noisiness can be reduced and a better sound quality is obtained. Furthermore, the decoder is less dependent on the performance of the noise encoding and it is less of a problem if for some reason not all tonal components are removed from the residual signal x3/x3′ in the noise encoder.

The total signal y(t) comprises the sum of the transient signal yT and the product of any amplitude decompression (g) and the sum of the sinusoidal signal yS and the noise signal yN′. The audio player comprises two adders 36 and 37 to sum respective signals. The total signal is furnished to an output unit 35, which is e.g. a speaker.
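Read sample by sample, the combination just described is y = yT + g·(yS + yN′). A minimal sketch, assuming equal-length sample lists and a single scalar gain g (the function name and values are hypothetical):

```python
def mix_output(y_t, y_s, y_n, gain=1.0):
    """Combine transient, sinusoidal and conditioned noise signals.

    Implements y = yT + g * (yS + yN') for each sample.
    """
    return [t + gain * (s + n) for t, s, n in zip(y_t, y_s, y_n)]

y = mix_output([0.5, 0.0], [0.25, 0.25], [0.125, -0.125], gain=2.0)
```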

FIG. 5 shows an audio system according to the invention comprising an audio coder 1′ as shown in FIG. 2 and an audio player 3 as shown in FIG. 3. Such a system offers playing and recording features. The audio stream AS is furnished from the audio coder to the audio player over a communication channel 2, which may be a wireless connection, a data bus or a storage medium. In case the communication channel 2 is a storage medium, the storage medium may be fixed in the system or may also be a removable disc, memory stick etc. The communication channel 2 may be part of the audio system, but will however often be outside the audio system.

1. A method of encoding an audio signal, the method comprising the steps of: providing a respective set of sampled signal values for each of a plurality of sequential segments; analysing the sampled signal values to determine zero or more sinusoidal components for each of the plurality of sequential segments; subtracting said sinusoidal components from said sampled signal values to provide a set of values representing a first residual component of said audio signal; conditioning said first residual component of said audio signal to remove selected tonal components from said first residual component and to provide a set of values representing a second residual component of said audio signal; modelling the second residual component of the audio signal by determining noise parameters approximating the second residual component; and generating an encoded audio stream including said noise parameters and codes representing said sinusoidal components.
2. A method according to claim 1 wherein said conditioning step comprises: providing a frequency spectrum representation for sequential segments of said set of values representing said first residual component of said audio signal; attenuating selected frequencies within each frequency spectrum representation; and providing a time domain representation for said sequential segments of frequency spectrum representations in which said selected frequencies have been attenuated.
3. A method according to claim 2 in which said attenuating step comprises: iteratively removing peaks of the greatest energy from said frequency spectrum representations.
4. A method according to claim 3 in which said iterations are stopped when the energy of the removed peak is less than a given percentage of the overall energy of the frequency spectrum representation from which the peak is removed.
5. A method according to claim 4 in which said energy level is inversely proportional to the length of said sequential segments.
6. A method according to claim 3 in which said iterations are stopped after a fixed number of iterations.
7. A method according to claim 2 in which said attenuating step comprises: removing a fixed number of peaks of the greatest energy from said frequency spectrum representations.
8. A method according to claim 2 in which said attenuating step comprises: determining frequency values for each of the sinusoidal components representing a sequential segment corresponding to the sequential segment for the frequency spectrum representation; and attenuating the frequency values of said frequency spectrum representation in the region of said frequency values for each of the sinusoidal components.
9. A method according to claim 2 in which said attenuating step comprises: determining first energy values for each of the sinusoidal components representing a sequential segment corresponding to the sequential segment for the frequency spectrum representation; determining second energy values for sampled signal values in said sequential segment corresponding to the sequential segment for the frequency spectrum representation; dividing said frequency spectrum representations into frequency bands according to a psycho-acoustic model; and zeroing the values for frequency bands where said first and second energy values are similar.
10. A method according to claim 9 wherein said encoded audio stream is generated with run-length coding representing sequences of frequency bands where values have been zeroed.
11. A method according to claim 2 wherein said analysing step comprises generating sinusoidal codes comprising tracks of linked sinusoidal components; and synthesizing said sinusoidal components using said sinusoidal codes and wherein said subtracting step comprises subtracting said synthesized signal values from said sampled signal values to provide said set of values representing the first residual component of said audio signal.
12. A method according to claim 11 wherein said attenuating step comprises: determining frequency values for sinusoidal components of said audio signal which were not used in generating said sinusoidal codes; determining if said sinusoidal components were not used for reasons including: said components being too short, said components being masked by other components and budgetary reasons; and attenuating the frequency values of said frequency spectrum representation in the region of unused sinusoidal components where said components were not used for being masked or for budgetary reasons.
13. A method according to claim 1 wherein said sampled signal values represent an audio signal from which transient components have been removed.
14. Method of decoding an audio stream, the method comprising the steps of: reading an encoded audio stream including codes representing a noise component of an audio signal; employing said codes to synthesize said noise component of said audio signal to produce a synthesized signal; and conditioning said synthesized signal to remove selected tonal components from said signal.
15. Audio coder arranged to process a respective set of sampled signal values for each of a plurality of sequential segments of an audio signal, said coder comprising: an analyser for analysing the sampled signal values to determine zero or more sinusoidal components for each of the plurality of sequential segments; a subtractor for subtracting said sinusoidal components from said sampled signal values to provide a set of values representing a first residual component of said audio signal; a conditioner for removing selected tonal components from said first residual component and providing a set of values representing a second residual component of said audio signal; a noise coder for modelling the second residual component of the audio signal by determining noise parameters approximating the second residual component; and a bitstream generator for generating an encoded audio stream including said noise parameters and codes representing said sinusoidal components.
16. Audio player comprising: means for reading an encoded audio stream including codes representing a noise component of an audio signal; a synthesizer arranged to employ said codes to synthesize said noise component of said audio signal to produce a synthesized signal; and a conditioner arranged to remove selected tonal components from said synthesized signal.
17. Audio system comprising an audio coder arranged to process a respective set of sampled signal values for each of a plurality of sequential segments of an audio signal, and an audio player, said audio coder comprising: an analyser for analysing the sampled signal values to determine zero or more sinusoidal components for each of the plurality of sequential segments; a subtractor for subtracting said sinusoidal components from said sampled signal values to provide a set of values representing a first residual component of said audio signal; a conditioner for removing selected tonal components from said first residual component and providing a set of values representing a second residual component of said audio signal; a noise coder for modelling the second residual component of the audio signal by determining noise parameters approximating the second residual component; and a bitstream generator for generating an encoded audio stream including said noise parameters and codes representing said sinusoidal components, and said audio player comprising: means for reading an encoded audio stream including codes representing a noise component of an audio signal; a synthesizer arranged to employ said codes to synthesize said noise component of said audio signal to produce a synthesized signal; and a conditioner arranged to remove selected tonal components from said synthesized signal.